The spei and spi functions compute the SPEI and the SPI indices, climatic proxies widely used for drought quantification and monitoring. Both functions are identical (in fact, spi is just a wrapper for spei), but they are kept separate for clarity. In essence, the functions standardize a variable following a log-Logistic (or Gamma, or PearsonIII) distribution function, i.e., they transform it to a standard Gaussian variate with zero mean and a standard deviation of one.
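The standardization idea can be sketched with base R alone. The example below is only an illustration: it assumes a Gamma distribution whose parameters are estimated by the method of moments, which is not the fitting method used by spei or spi, and it works on synthetic data.

```r
# Sketch of the standardization step, using base R only.
# Assumption: Gamma parameters come from the method of moments,
# which is NOT the fitting method used by spei() / spi().
set.seed(1)
x <- rgamma(600, shape = 2, rate = 0.05)   # synthetic precipitation-like values

shape_hat <- mean(x)^2 / var(x)            # moment estimate of the shape
rate_hat  <- mean(x) / var(x)              # moment estimate of the rate

# Map each value to its cumulative probability under the fitted Gamma,
# then to the standard normal quantile with the same probability.
p <- pgamma(x, shape = shape_hat, rate = rate_hat)
z <- qnorm(p)

# The standardized values should have mean near 0 and sd near 1.
round(c(mean = mean(z), sd = sd(z)), 2)
```

Values of z can then be compared across sites or months because they share a common scale.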
Input data
The input variable is a time-ordered series of precipitation values for spi, or a series of the climatic water balance (precipitation minus potential evapotranspiration) for spei. When used with the default options, the functions yield values of both indices exactly as defined in the references given below.
The SPEI and the SPI were defined for monthly data. Since the PDFs of the data are not homogeneous from month to month, the data are split into twelve series (one for each month) and an independent PDF is fitted to each series. If data is a vector or a matrix, it is treated as a sequence of monthly values starting in January. If it is a (univariate or multivariate) time series, the function cycle is used to determine the position of each observation within the year (month), allowing the data to start in a month other than January.
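The month-splitting step described above can be illustrated with base R. This sketch uses a made-up series starting in April 2000 and only shows how cycle identifies the calendar month of each observation; it does not reproduce the internal code of the functions.

```r
# A monthly time series that starts in April 2000 (synthetic values).
x <- ts(1:18, start = c(2000, 4), frequency = 12)

# cycle() returns the position of each observation within the year.
month <- as.integer(cycle(x))
month[1]          # calendar month of the first observation

# Split the series into twelve sub-series, one per calendar month.
by_month <- split(as.numeric(x), month)
lengths(by_month) # number of observations available for each month
```

A plain vector carries no such information, which is why it is assumed to start in January.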
Time scales
An important advantage of the SPEI and the SPI is that they can be computed at different time scales. This makes it possible to incorporate the influence of past values of the variable into the computation, so that the index adapts to the memory of the system under study. The length of this memory is controlled by the parameter scale. For example, a value of six implies that data from the current month and from the five preceding months are used to compute the SPEI or SPI value for a given month. By default, all past data have the same weight in computing the index, as originally proposed in the references below. Other kernels, however, are available through the parameter kernel, a list defining the shape of the kernel and a time shift. These parameters are then passed to the function kern.
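The effect of the time scale with equal weights can be sketched as a trailing sum over the current and the five previous months. The code below uses base R only and synthetic data; it is an illustration of the idea, and the weights produced by the package's own kern function may be scaled differently.

```r
# Synthetic monthly series, three years long.
set.seed(42)
x <- ts(rgamma(36, shape = 2, rate = 0.05), start = c(2000, 1), frequency = 12)

n_months <- 6                 # plays the role of the 'scale' parameter
w <- rep(1, n_months)         # equal weights: current month plus five past months

# One-sided convolution: each value is combined with the preceding ones.
acc <- stats::filter(x, w, method = "convolution", sides = 1)

# The first (n_months - 1) values are NA because not enough past data exist.
head(acc, 8)
```

The accumulated series, rather than the raw one, is what would then be standardized month by month.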
Probability distributions
Following the original definitions, spei uses a log-Logistic distribution by default, and spi uses a Gamma distribution. This behaviour can be modified, however, through the parameter distribution.
Fitting methods
The default method for parameter fitting is based on unbiased Probability Weighted Moments ('ub-pwm'), but other methods can be selected through the parameter fit. A valid alternative is the plotting-position PWM ('pp-pwm') method. For the log-Logistic distribution, the maximum likelihood method ('max-lik') is also available.
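A possible way of switching the distribution and the fitting method is sketched below. It assumes the SPEI package is installed, uses synthetic precipitation data, and assumes that the computed index series is stored in a component named fitted of the returned object.

```r
library(SPEI)  # assumes the SPEI package is installed

# Synthetic monthly precipitation (30 years), for illustration only.
set.seed(123)
prcp <- ts(rgamma(360, shape = 2, rate = 0.02),
           start = c(1981, 1), frequency = 12)

# Defaults for spi: Gamma distribution, 'ub-pwm' fitting.
spi_default <- spi(prcp, scale = 3)

# Same distribution, plotting-position PWM fitting.
spi_pp <- spi(prcp, scale = 3, fit = "pp-pwm")

# Alternative distribution with the default fitting method.
spi_pe3 <- spi(prcp, scale = 3, distribution = "PearsonIII")

# Compare two variants (assumes the index is stored in '$fitted').
cor(as.numeric(spi_default$fitted), as.numeric(spi_pp$fitted),
    use = "complete.obs")
```

With well-behaved data the variants are usually expected to agree closely, so a comparison like this mainly helps to spot problematic fits.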
User-provided parameters
It is possible to override parameter fitting and supply user-defined parameters instead. This is activated with the parameter params. The exact values provided to this parameter depend on the distribution function being used. For log-Logistic and PearsonIII, it should be a three-dimensional array with dimensions (3, number of series in data, 12), containing twelve parameter triads (xi, alpha, kappa) for each data series, one for each month. For Gamma, it should be a three-dimensional array with dimensions (2, number of series in data, 12), containing twelve parameter pairs (alpha, beta). It is a good idea to look at the coefficients slot of a previously fitted spei object in order to understand the structure of the parameter array. The parameter distribution is still used under this option to determine which distribution function should be used.
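Reusing previously fitted parameters might look like the sketch below. It assumes the SPEI package is installed, that the coefficients slot is accessible as fit1$coefficients, and that the index series is stored in a component named fitted; the water balance series is synthetic.

```r
library(SPEI)  # assumes the SPEI package is installed

# Synthetic monthly climatic water balance (30 years), for illustration only.
set.seed(7)
wb <- ts(rnorm(360, mean = 10, sd = 40), start = c(1981, 1), frequency = 12)

# First pass: let the function fit the log-Logistic parameters.
fit1 <- spei(wb, scale = 12)

# Inspect the structure of the parameter array; per the description above
# it should have dimensions (3, number of series, 12).
dim(fit1$coefficients)

# Second pass: skip fitting and reuse the stored parameters.
fit2 <- spei(wb, scale = 12, params = fit1$coefficients)
```

The same pattern could be used to apply parameters fitted on one dataset to another series, provided the array dimensions match the new data.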
Reference period
By default, the functions use all the values provided in data for parameter fitting. However, this can be modified with the parameters ref.start and ref.end. These parameters define a subset of values to be used for parameter fitting, i.e. a reference period. The functions, however, will still compute the values of the indices for the whole data set. For these options to work, data must be a time series object. The starting and ending points of the reference period are then defined as pairs of year and month values, e.g. c(1900,1).
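A reference period could be specified as in the following sketch. It assumes the SPEI package is installed and that the index series is stored in a component named fitted; the data and the chosen years are made up.

```r
library(SPEI)  # assumes the SPEI package is installed

# Synthetic monthly water balance covering 1971-2010, for illustration only.
set.seed(11)
wb <- ts(rnorm(480, mean = 10, sd = 40), start = c(1971, 1), frequency = 12)

# Fit the distribution parameters on 1981-2010 only; the index is still
# computed for the whole series.
spei_ref <- spei(wb, scale = 12,
                 ref.start = c(1981, 1), ref.end = c(2010, 12))

# The output covers the full input period, not just the reference period.
length(spei_ref$fitted) == length(wb)
```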
Processing large datasets
The spei and spi functions can process multivariate datasets at once. If a matrix or data frame is supplied as data, with time series of precipitation or of precipitation minus potential evapotranspiration arranged in columns, the result is a matrix (data frame) of SPI or SPEI series. This makes processing large datasets straightforward, since no loops are needed.
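Processing several series in one call might look like the sketch below. It assumes the SPEI package is installed and that the index series are stored in a component named fitted; the three station columns are hypothetical and filled with synthetic data.

```r
library(SPEI)  # assumes the SPEI package is installed

# Synthetic monthly precipitation for three hypothetical stations (30 years).
# A plain matrix is treated as monthly data starting in January.
set.seed(3)
prcp_mat <- matrix(rgamma(360 * 3, shape = 2, rate = 0.02), ncol = 3,
                   dimnames = list(NULL, c("st1", "st2", "st3")))

# One call computes the six-month SPI for every column.
spi_all <- spi(prcp_mat, scale = 6)

# One output column per input series.
dim(spi_all$fitted)
```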